ABSTRACT

This article deals with necessary conditions for problems in the
calculus of variations that incorporate inequality constraints of the form

$f(x, \dot{x}) \le 0$.

Heretofore, such problems have been reduced to the equality case by a
method due to F. A. Valentine. It is shown that by avoiding this
transition and treating these problems directly, the classical multiplier
rule can be obtained under significantly weaker regularity and rank
hypotheses.

Besides extending the known results in the case of smooth data, the
present work generalizes the multiplier rule to nondifferentiable functions.
In §3 we resurrect Queen Dido in order to present an example of a
variational problem involving a nondifferentiable function.

INEQUALITY CONSTRAINTS IN THE CALCULUS OF VARIATIONS

Frank  H.  Clarke 

1.  Introduction.  The  classical  multiplier  rule. 

The  purpose  of  this  section  is  to  review  the  multiplier  rule  in  order 
to  place  the  results  of  this  report  in  perspective.  Let  us  begin  by 
considering  the  following  problem  of  Mayer  in  the  calculus  of  variations: 
we  seek  to  minimize 

(1.1)  $\varphi(x(1))$

over a class of functions $x : [0,1] \to R^n$, subject to the boundary
conditions

(1.2)  $x(0) \in C_0, \quad x(1) \in C_1$,

as well as the equality constraints

(1.3)  $f_i(x(t), \dot{x}(t)) = 0 \quad (i = 1, 2, \ldots, r;\ t \in [0,1])$.

In the above, the functions $\varphi$ and $f_i$ and the sets $C_0$ and $C_1$
are given; we leave unspecified for now the class of functions $x$
admitted to competition, as well as other details. Let us mention the
well-known fact that superficially different problems involving the
minimization of integrals can be reshaped to fit the above mould (see
[10, Chapter b]).

Suppose now that the function $z$ solves this problem. The
"multiplier rule" is a theorem stating that, under suitable hypotheses,
there exist functions $\lambda_i$ $(i = 1, 2, \ldots, r)$, not all zero (these are the
"Lagrange multipliers"), such that $z$ satisfies the Euler equation for
the minimization of the integral

$\int_0^1 \sum_i \lambda_i f_i(x, \dot{x})\,dt$

(summations are from 1 to $r$). That is, the following differential equation
holds:

(1.4)  $\frac{d}{dt}\Big\{\sum_i \lambda_i D_2 f_i(z, \dot{z})\Big\} = \sum_i \lambda_i D_1 f_i(z, \dot{z})$.

($D_1$ and $D_2$ denote differentiation of $f(x, \dot{x})$ with respect to the $x$
and $\dot{x}$ variables respectively.)

The  proof  of  the  multiplier  rule  was  finally  completed  by  Hilbert 
following the contributions of many mathematicians (see [2] for historical
details). It turns out that the main requirement to assure its validity
is the following:

(1.5)  the vectors $D_2 f_i(z, \dot{z})$ in $R^n$ are linearly
       independent for each $t$.

Consider now a different problem, where instead of the equality
constraints (1.3) being imposed, we have the inequality constraints

(1.6)  $f_i(x, \dot{x}) \le 0 \quad (i = 1, 2, \ldots, r)$.

Forty years ago, F. A. Valentine [11] proposed a method (called
that of "slack variables") whereby this problem could be treated by the
existing theory for the case of equality constraints; ever since, it is
this method that has been used in handling constraints of the form (1.6)
(see for example [1], [7]). When the multiplier rule is applied to the
problem via Valentine's method, the analysis yields as before a nontrivial
set of $\lambda_i$ satisfying (1.4). Additionally, it follows that the $\lambda_i$ are
nonnegative, and that for any $t$ such that $f_i(z, \dot{z}) < 0$ (the constraint
$f_i \le 0$ is then said to be inactive), we have $\lambda_i(t) = 0$.

We stress that this approach to the multiplier rule for inequality
constraints requires (as in the equality case) that hypothesis (1.5) be
made (for the active indices).

The central thesis of this article is that the case of inequality
constraints is best treated on its own. For example, we will show
(Corollary 2) that in the example discussed above, hypothesis (1.5)
can be replaced by the following weaker condition:

(1.7)  the vectors $D_2 f_i(z, \dot{z})$ (active indices $i$) are
       convexly independent for each $t$,

by which we mean that no convex combination of these vectors is equal
to zero. An immediate consequence of this is that we are now able to
treat problems in which the number of (active) inequality constraints
is greater than the dimension $n$ (this would be precluded, of course,
by condition (1.5)), and possibly infinite.
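
The difference between linear and convex independence can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the vectors are given as plain Python tuples and uses a Frank-Wolfe (Gilbert-type) iteration, an algorithm chosen here purely for illustration, to approximate the point of least norm in their convex hull. The vectors are convexly independent exactly when that point is nonzero. The function names, the iteration cap and the tolerance are hypothetical choices, and the test is only as reliable as the tolerance when zero lies on or near the boundary of the hull.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def min_norm_point(vectors, iters=1000):
    """Approximate the point of least Euclidean norm in the convex hull."""
    x = [float(c) for c in vectors[0]]
    for _ in range(iters):
        # Vertex of the hull that is most favourable for reducing |x|.
        s = min(vectors, key=lambda v: dot(x, v))
        d = [si - xi for si, xi in zip(s, x)]
        dd = dot(d, d)
        if dd == 0.0:
            break
        # Exact line search for |x + step * d|^2 on the segment [x, s].
        step = max(0.0, min(1.0, -dot(x, d) / dd))
        if step == 0.0:
            break
        x = [xi + step * di for xi, di in zip(x, d)]
    return x

def convexly_independent(vectors, tol=1e-8):
    """True when no convex combination of the vectors is (numerically) zero."""
    x = min_norm_point(vectors)
    return math.sqrt(dot(x, x)) > tol

# Three vectors in the plane: linearly dependent, yet convexly independent.
three_in_plane = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# Two opposite vectors: their midpoint is zero, so they are not.
opposite = [(1.0, 0.0), (-1.0, 0.0)]

print(convexly_independent(three_in_plane))
print(convexly_independent(opposite))
```

The first family has more members than the dimension of the space, so a rank condition of type (1.5) cannot hold for it, while the convex-hull test still succeeds.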


An  equally  important  feature  of  the  results  is  that  no  differentiability 
hypotheses  intervene.  We  give  an  example  in  § 3 of  a variant  of  a 
classical  problem  in  which  a nondifferentiable  function  appears  quite 
naturally.  The  next  section  is  devoted  to  the  statement  and  elaboration 
of  the  main  result,  the  proof  of  which  is  given  in  §4. 


2.  A new multiplier rule.


An arc is an absolutely continuous function $x : [0,1] \to R^n$. We
are given the functions $\varphi : R^n \to R$ and $f : R^n \times R^n \to R$, as well as
two subsets $C_0$ and $C_1$ of $R^n$. The problem we consider is the
following: to minimize

(2.1)  $\varphi(x(1))$

over all arcs $x$ which satisfy

(2.2)  $x(0) \in C_0, \quad x(1) \in C_1$

as well as the inequality constraint

(2.3)  $f(x, \dot{x}) \le 0$ a.e.

The notation "a.e." signifies "for almost all $t$ in $[0,1]$", in the
sense of Lebesgue measure. The choice of the interval $[0,1]$ is merely
a convenient normalization.

The following hypotheses are made throughout: $C_0$ and $C_1$ are
closed, and $\varphi$ and $f$ are locally Lipschitz. The requirement that $\varphi$
(for example) be locally Lipschitz is equivalent to the following: for
any bounded subset $B$ of $R^n$, there is a scalar $K$ (depending on $B$)
such that for all $x_1$ and $x_2$ in $B$, we have

$|\varphi(x_1) - \varphi(x_2)| \le K |x_1 - x_2|$.

The classical multiplier rule is stated in terms of derivatives.
Since differentiability is not being posited, a substitute for derivatives
will be used. This is the "generalized gradient" introduced by the
author in [3] (see [6] for the infinite-dimensional definition). In the
case of a locally Lipschitz function $g : R^n \to R$, the generalized
gradient of $g$ at the point $x$, denoted $\partial g(x)$, may be defined as
follows:

(2.4)  $\partial g(x) = \mathrm{co}\{\zeta : \zeta = \lim_{i \to \infty} \nabla g(x_i),\ \lim_{i \to \infty} x_i = x\}$.

That is, we consider all sequences $x_i$ converging to $x$ such that
$\nabla g(x_i)$ exists for each $i$, and such that the indicated limit $\zeta$ exists.
The convex hull of all the points $\zeta$ obtained in this way is $\partial g(x)$.

It is evident that if $g$ is $C^1$, then $\partial g(x) = \{\nabla g(x)\}$. Furthermore,
it may be shown that when $g$ is convex, $\partial g(x)$ is the subdifferential
of convex analysis [8].
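
As a brief illustration of the definition (an example added here for orientation; the function $|x|$ is not discussed at this point in the text), take $n = 1$ and $g(x) = |x|$. The gradient exists at every $x_i \ne 0$, and only two limits are possible along sequences converging to 0:

```latex
% Generalized gradient of g(x) = |x| on R, computed from definition (2.4).
\[
  \nabla g(x_i) =
  \begin{cases}
    +1 & \text{if } x_i > 0,\\
    -1 & \text{if } x_i < 0,
  \end{cases}
  \qquad\text{so}\qquad
  \partial g(0) = \mathrm{co}\{-1, +1\} = [-1, 1],
\]
\[
  \partial g(x) = \{\operatorname{sgn} x\} \quad\text{for } x \neq 0 .
\]
```

Since $|x|$ is convex, the interval $[-1, 1]$ is also its subdifferential at 0 in the sense of convex analysis, in agreement with the remark above.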

We now recall some terminology familiar from the calculus of
variations. The arc $z$ is a weak local minimum in the above problem
if, for some positive $\epsilon$, $z$ solves the minimization problem (2.1) - (2.3)
relative to the arcs $x$ satisfying

$|x(t) - z(t)| < \epsilon, \quad |\dot{x}(t) - \dot{z}(t)| < \epsilon$ a.e.

The arc $z$ is piecewise-smooth if there is a partition $0 = t_0 < t_1 < \ldots < t_m = 1$
of $[0,1]$ such that $\dot{z}$ exists and is continuous on $(t_{i-1}, t_i)$ $(i = 1, 2, \ldots, m)$
and admits finite limits at both $t_{i-1}$ (from the right) and $t_i$ (from the
left). These limits are denoted $\dot{z}(t_{i-1}+)$ and $\dot{z}(t_i-)$ respectively.
When $z$ fails to be differentiable at a point, $z$ is said to have a
corner there.


Definition. For a piecewise-smooth arc $z$, we say $\partial f$ is regular
along $z$ if the following condition is satisfied for all $t$ such that
$f(z(t), \dot{z}(t)) = 0$:

(2.5)  $(R^n \times \{0\}) \cap \partial f(z, \dot{z}) = \emptyset$,

where for corner points $t$ the condition is understood to hold with $\dot{z}(t)$
replaced by both $\dot{z}(t+)$ and $\dot{z}(t-)$. Thus $\partial f$ is regular along $z$ when
the $\dot{x}$-component of any element of $\partial f(z, \dot{z})$ is nonzero, for any $t$ such
that $f(z, \dot{z}) = 0$.

Theorem 1. Let the piecewise-smooth arc $z$ provide a weak local
minimum for the problem (2.1) - (2.3), where $\partial f$ is regular along $z$.
Then there exist an arc $p$, a measurable function $\lambda : [0,1] \to R$,
and a scalar $\lambda_0$ equal to 0 or 1 such that:

(2.6)  $(\dot{p}(t), p(t)) \in \lambda(t)\,\partial f(z(t), \dot{z}(t))$ a.e.,

(2.7)  $\lambda(t) \ge 0$, $\lambda(t) = 0$ when $f(z(t), \dot{z}(t)) < 0$,

(2.8)  $p(0)$ is normal to $C_0$ at $z(0)$,

(2.9)  there is a vector $\zeta$ in $\partial\varphi(z(1))$ such that
       $-p(1) - \lambda_0 \zeta$ is normal to $C_1$ at $z(1)$,

(2.10)  $|p(t)| + \lambda_0$ is never zero.

Remark 1. The word "normal" appearing in the "transversality conditions"
(2.8) - (2.9) is used in a generalized sense defined in [3]; this reduces
to the usual concepts in the case of a $C^1$-manifold or a convex set. When
there is no endpoint constraint (i.e. $C_1 = R^n$), it follows that $\lambda_0 = 1$,
and (2.9) becomes

$-p(1) \in \partial\varphi(z(1))$.

The applicability of Theorem 1 may at first appear limited due to
the fact that only the single inequality constraint (2.3) is considered,
whereas most problems will incorporate multiple constraints. We shall
see that in making the transition to such problems, the fact that $f$
need not be differentiable is crucial. We indicate at the end of §4
the modifications to be made in Theorem 1 when $f$ has an explicit
dependence on $t$.

Let us now consider the problem of minimizing (2.1) subject to
(2.2) and the $r$ inequality constraints

(2.11)  $f_i(x, \dot{x}) \le 0 \quad (i = 1, 2, \ldots, r)$.

We shall suppose that each $f_i$ is locally Lipschitz. Let us define $f$
as follows:

(2.12)  $f(s, v) = \max_{1 \le i \le r} f_i(s, v)$.

Then the system of inequalities (2.11) is equivalent to the single
inequality (2.3).
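
This reduction to a single constraint is easy to mirror in code. The sketch below is illustrative only: the two constraint functions `f1` and `f2` are hypothetical (they are not taken from the text), chosen so that some sample points are feasible and one is not. It checks that the single inequality on the pointwise maximum agrees with the system of inequalities, and it lists the indices at which the maximum is attained.

```python
def f1(s, v):
    # Hypothetical constraint: speed bound |v| <= 1, written as v^2 - 1 <= 0.
    return v * v - 1.0

def f2(s, v):
    # Hypothetical constraint: s + v <= 2, written as s + v - 2 <= 0.
    return s + v - 2.0

CONSTRAINTS = [f1, f2]

def f(s, v):
    """Pointwise maximum of the constraint functions."""
    return max(fi(s, v) for fi in CONSTRAINTS)

def maximizing_indices(s, v, tol=1e-12):
    """1-based indices i for which f_i(s, v) equals f(s, v) up to tol."""
    top = f(s, v)
    return [i for i, fi in enumerate(CONSTRAINTS, start=1)
            if top - fi(s, v) <= tol]

for s, v in [(0.0, 0.5), (1.0, 1.0), (3.0, 0.0)]:
    single = f(s, v) <= 0.0                              # one inequality
    system = all(fi(s, v) <= 0.0 for fi in CONSTRAINTS)  # r inequalities
    print((s, v), single, system, maximizing_indices(s, v))
```

Taking the maximum generally destroys differentiability where two of the functions coincide (as at the sample point (1.0, 1.0) here), even when each of them is smooth.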

Corollary 1. Let the piecewise-smooth arc $z$ provide a weak local
minimum for the problem of minimizing (2.1) subject to (2.2) and (2.11),
and suppose that for each $t$, for each point $(\alpha, \beta)$ in the common
convex hull of the sets

$\partial f_i(z, \dot{z})$, $i$ active,

we have $\beta \ne 0$. Then there exist an arc $p$, measurable functions
$\lambda_i : [0,1] \to R$ $(i = 1, 2, \ldots, r)$, and a scalar $\lambda_0$ equal to 0 or 1
such that (2.8) - (2.10) hold, and also:

(2.13)  $(\dot{p}, p) \in \sum_i \lambda_i(t)\,\partial f_i(z, \dot{z})$ a.e.,

(2.14)  $\lambda_i \ge 0$, $\lambda_i(t) = 0$ when $f_i(z, \dot{z}) < 0$.

Proof: When $f$ is defined by (2.12), the set $\partial f(s, v)$ is contained in
the common convex hull of the sets $\partial f_i(s, v)$ over the indices $i$ for
which the maximum in (2.12) is attained [6, Proposition 9]. It follows
from this that $\partial f$ is regular along $z$, so that Theorem 1 may be
applied. Upon invoking a measurable selection theorem (see for example
[9]), (2.6) yields: there exist nonnegative measurable functions $\gamma_i$
such that

$(\dot{p}, p) \in \lambda(t) \sum_i \gamma_i(t)\,\partial f_i(z, \dot{z})$,

and if $f_i(z, \dot{z}) < 0$ then either $\lambda(t)$ or $\gamma_i(t)$ is zero. The required
conclusions now follow upon setting $\lambda_i = \lambda \gamma_i$. Q.E.D.

We now specialize to the classic case of continuous differentiability.
As mentioned in §1, hypothesis (1.5) is replaced by the less restrictive (1.7).

Corollary 2. Let the piecewise-smooth arc $z$ solve the problem of
minimizing (2.1) subject to (2.2) and the $r$ inequalities (2.11), where
the functions $f_i$ are $C^1$. Suppose that condition (1.7) holds. Then
there exist measurable functions $\lambda_i$ $(i = 1, 2, \ldots, r)$ and a scalar $\lambda_0$
equal to 0 or 1 such that:

(2.15)  $p(t) = \sum_i \lambda_i(t) D_2 f_i(z, \dot{z})$ is an absolutely continuous
        function of $t$ satisfying (2.8) - (2.10),

(2.16)  $\lambda_i \ge 0$, $\lambda_i(t) = 0$ when $f_i(z, \dot{z}) < 0$,

(2.17)  $\frac{d}{dt}\Big\{\sum_i \lambda_i(t) D_2 f_i(z, \dot{z})\Big\} = \sum_i \lambda_i(t) D_1 f_i(z, \dot{z})$.

Proof: It suffices to apply Corollary 1, noting that generalized gradients
reduce here to derivatives. Q.E.D.

Remark 2. In analogy to the classical case, the above allows us to
assert that the $\lambda_i$ are not all zero if no vector in $-\partial\varphi(z(1))$ is normal
to $C_1$ at $z(1)$.

Remark 3. There is a theorem concerning the generalized gradient of
the upper envelope of a family of functions [3, Theorem 2.1] that can
be used to derive from Theorem 1 a version of the multiplier rule for an
infinite number of constraints, in a manner completely analogous to that
in which the above corollaries were obtained.

3.  Example - Queen Dido and the badlands.


Queen Dido is given a length of cord with which to enclose a region
along the shore, the latter being represented by the line $x = 0$ in the
$t$-$x$ plane (see Figure 1). In doing this, she seeks to join the point
$(0, 0)$ to the point $(1, 0)$ by a curve of length $L$ lying in the
half-plane $x \ge 0$ so as to maximize the area between the curve and the
$t$-axis. The problem as described to this point is classical, but let us
now suppose that for a given positive $a$, the terrain $x > a$ is
inferior, and worth only half as much as the terrain $x < a$. The return
corresponding to a choice of border function $x(t)$ is then

(3.1)  $\int_0^1 g(x(t))\,dt$,

where

$g(x) = x$ if $x \le a$,
$g(x) = (x + a)/2$ if $x > a$.

Her majesty is seeking to maximize (3.1) (or minimize its negative)
subject to

(3.2)  $x(0) = 0, \quad x(1) = 0$,

(3.3)  $\int_0^1 \sqrt{1 + \dot{x}^2}\,dt = L$.


Note  that  g is  Lipschitz  and  nondifferentiable. 
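
The kink of $g$ at $x = a$ can be made concrete. The following sketch is an illustration only: the value `A = 0.3` is a hypothetical choice of $a$, and the generalized gradient is approximated by collecting difference quotients of $g$ at sample points on either side of the given point. This is adequate for a piecewise-linear function of one variable but is not a general-purpose method.

```python
A = 0.3  # hypothetical height at which the badlands begin

def g(x, a=A):
    """Value of the land at height x: full value up to a, half value beyond."""
    return x if x <= a else (x + a) / 2.0

def slope(x, h=1e-7):
    """Central difference quotient of g; meaningful where g is differentiable."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def approx_generalized_gradient(x, radius=1e-4, samples=50):
    """Interval [lo, hi] spanned by slopes of g at points near (but not at) x."""
    points = [x + sign * radius * k / samples
              for k in range(1, samples + 1)
              for sign in (-1.0, 1.0)]
    slopes = [slope(p) for p in points]
    return min(slopes), max(slopes)

lo, hi = approx_generalized_gradient(A)
print(lo, hi)      # roughly 0.5 and 1.0: the two one-sided slopes at x = a
print(lo > 0.0)    # zero lies outside the interval [lo, hi]

lo_smooth, hi_smooth = approx_generalized_gradient(0.1)
print(lo_smooth, hi_smooth)  # both roughly 1.0 away from the kink
```

The interval found at $x = a$ is bounded away from zero, which is the property of $g$ that the analysis of this example relies on.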

We  proceed  to  place  this  problem  within  the  framework  of  §2, 
Corollary  1.  We  consider  the  two  additional  variables  y and  z and 


(3.6)  $x(0) = 0,\ y(0) = 0,\ z(0) = 0,\ x(1) = 0,\ z(1) = L$,

and we define

(3.7)  $\varphi(x(1), y(1), z(1)) = y(1)$.

It is not difficult to see that the problem of minimizing (3.7)
subject to (3.4) - (3.6) is equivalent to Queen Dido's. The equality
(3.3) has been replaced by

$\int_0^1 \sqrt{1 + \dot{x}^2}\,dt \le L$

in this transition, which makes no difference inasmuch as all the
available cord will be used. In fact, it is clear from the nature of the
problem that both constraints (3.4) and (3.5) will be active at all times.

In applying Corollary 1, note that the vector $x$ is here replaced
by $(x, y, z)$, that $n = 3$ and $r = 2$. The sets $C_0$ and $C_1$ are
$\{(0,0,0)\}$ and $\{0\} \times R \times \{L\}$ respectively. The functions involved are
Lipschitz as required, and the sets $\partial f_1$ and $\partial f_2$ are seen to be:

$\partial f_1(x, y, z, \dot{x}, \dot{y}, \dot{z}) = \{(\zeta, 0, 0, 0, -1, 0) : -\zeta \in \partial g(x)\}$,

$\partial f_2(x, y, z, \dot{x}, \dot{y}, \dot{z}) = \{(0, 0, 0, \dot{x}/\sqrt{1 + \dot{x}^2}, 0, -1)\}$,

from which we infer that the conclusions of Corollary 1 are available
to us for any piecewise-smooth solution, which we shall denote $(x, y, z)$.


We deduce the existence of nonnegative functions $\lambda_1$ and $\lambda_2$ such
that:

the function $p(t)$ defined by

$p(t) = [\lambda_2 \dot{x}/\sqrt{1 + \dot{x}^2},\ -\lambda_1,\ -\lambda_2]$

is absolutely continuous, and

(3.8)  $\dot{p}(t) \in \{-\lambda_1(t)\,\partial g(x)\} \times \{0\} \times \{0\}$.

It follows that $\lambda_1$ and $\lambda_2$ are constant. From (2.9) we obtain:

$\lambda_1 = \lambda_0$.

If $\lambda_0$ is zero, then $\lambda_1$ is zero also, and it follows from (2.10) that
$\lambda_2$ must be strictly positive. But then (3.8) implies that the sign of
$\dot{x}$ is constant, which is not possible except in the degenerate case $L = 1$.

We may thus suppose $\lambda_0 = 1 = \lambda_1$. Now if $\lambda_2$ were zero, (3.8)
would yield

$0 \in \partial g(x)$,

which is not possible in view of (2.4). Thus $\lambda_2$ is positive.

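We have arrived at the following conclusions: $\dot{x}$ is continuous
and satisfies the equation

(3.9)  $\frac{d}{dt}\Big[\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}\Big] = -1/\lambda_2$ if $x < a$,
       $\frac{d}{dt}\Big[\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}\Big] = -1/(2\lambda_2)$ if $x > a$.

Note that $x(t)$ cannot equal $a$ in any interval, since zero does not
belong to $\partial g(a)$.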

The solutions to the two separate cases in (3.9) are well-known,
since each case is the type of equation that arises in the classical
version of Queen Dido's problem. We find with no difficulty that $x$
describes an arc of a circle of radius $\lambda_2$ for $x < a$, and an arc of
a circle of radius $2\lambda_2$ for $x > a$. The requirement that these arcs
meet with a common tangent (at $x = a$) assures that to each $\lambda_2$
there corresponds at most one such configuration (see Figure 1).
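
The step from (3.9) to circular arcs can be spelled out as follows. This is a sketch of the standard computation, added for convenience; the tangent angle $\theta$, the arc length $s$ and the curvature $\kappa$ are notation introduced here and do not appear in the original text. The left-hand side of (3.9) is the signed curvature of the graph of $x$, so each case of (3.9) asserts that the curvature is constant:

```latex
% The left-hand side of (3.9) is the signed curvature of the graph of x.
% Notation introduced here: \theta = tangent angle, s = arc length.
\[
  \tan\theta = \dot{x}, \qquad
  \sin\theta = \frac{\dot{x}}{\sqrt{1+\dot{x}^2}}, \qquad
  \frac{ds}{dt} = \sqrt{1+\dot{x}^2} = \frac{1}{\cos\theta},
\]
\[
  \frac{d}{dt}\!\left[\frac{\dot{x}}{\sqrt{1+\dot{x}^2}}\right]
  = \cos\theta\,\frac{d\theta}{dt}
  = \frac{d\theta}{ds}
  = \kappa ,
\]
\[
  \kappa = -\frac{1}{\lambda_2} \ \ (x < a), \qquad
  \kappa = -\frac{1}{2\lambda_2} \ \ (x > a).
\]
```

A plane curve of constant curvature $\kappa \ne 0$ is an arc of a circle of radius $1/|\kappa|$, which gives the two radii quoted above; the negative sign indicates that the arcs are concave.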

Consequently, the optimal arc $x$ is uniquely specified once $\lambda_2$
is known; $\lambda_2$ is determined by the condition that $x$ is of given length
$L$. Once the nature of $x$ is known to be as described above, it is an
easy exercise to obtain (implicit) equations for $\lambda_2$ (and the other
parameters of the solution). These relations could then be used to
calculate explicitly the solution $x$.

It  is  interesting  to  determine  the  nature  of  the  information  contributed 
by  the  new  multiplier  rule.  Based  on  the  known  classical  solution, 
one  might  expect  the  solution  to  the  present  problem  to  consist  of  an 
amalgam  of  circular  arcs  on  either  side  of  the  line  x = a (as  indeed  it 
does).  The  multiplier  rule  has  served  to  rule  out  the  possibility  that 
x lies  along  the  line  x = a for  any  length  of  time,  and  has  yielded 
the  crucial  facts  that  the  radii  of  the  upper  and  lower  arcs  are  in  the 
ratio  of  two  to  one,  and  that  these  three  pieces  are  smoothly  joined. 

Thus  the  information  obtained  from  its  use  has  been  essentially  global. 

4.  Proof of Theorem 1.


For ease of notation, we denote $f(z(t), \dot{z}(t))$ and $\partial f(z(t), \dot{z}(t))$
by $f(t)$ and $\partial f(t)$ respectively. When $t$ is a corner point, there
will be occasions when $f(t)$ is to be interpreted as $f(z(t), \dot{z}(t+))$ or
$f(z(t), \dot{z}(t-))$, but the context will make this evident. The open unit
ball in $R^{2n}$ is denoted $B$.

Lemma 1. There is a constant $M$ with the following property: given
any $t$ in $[0,1]$ and $(s, v)$ in $(z(t), \dot{z}(t)) + B$, then for all $\zeta$ in
$\partial f(s, v)$ we have $|\zeta| < M$.

Proof: This follows from the hypothesis that $f$ is Lipschitz on
bounded sets, and from the definition (2.4) of generalized gradient. Q.E.D.

Lemma 2. There exist positive numbers $\delta_1$ and $\delta_2$ such that, for
any $t$ in $[0,1]$, for any $(s, v)$ in $(z(t), \dot{z}(t)) + \delta_1 B$, for any $(\alpha, \beta)$
in $\partial f(s, v)$, we have

$|\beta| \ge \delta_2$.

Proof: Suppose the lemma false. Then for each $i = 1, 2, \ldots$, there
exist $t_i$ in $[0,1]$, $(s_i, v_i)$ in $(z(t_i), \dot{z}(t_i)) + (1/i)B$ and $(\alpha_i, \beta_i)$ in
$\partial f(s_i, v_i)$ such that $|\beta_i| < 1/i$. By taking subsequences we may assume
that, for some $t$ in $[0,1]$, for some $(s, v)$ and $(\alpha, 0)$ in $R^{2n}$,
we have $t_i \to t$, $(s_i, v_i) \to (s, v)$, and $(\alpha_i, \beta_i) \to (\alpha, 0)$. It follows that
$(s, v) = (z(t), \dot{z}(t))$. Furthermore, by the upper semicontinuity of the
generalized gradient [3], we know that $(\alpha, 0)$ belongs to $\partial f(t)$. This
contradicts the regularity of $\partial f$ along $z$. Q.E.D.


Now let any positive integer $K$ be given, and choose $\epsilon_K$ so
that, for any $t$ in $[0,1]$, the inequality

$|(s, v) - (z(t), \dot{z}(t))| < \epsilon_K$

implies

$f(s, v) < f(z(t), \dot{z}(t)) + 1/K$.

Such a choice is possible because $f$ is uniformly continuous on compact
sets. We may suppose that $\epsilon_K$ is less than $1/K$, and also less than
the $\epsilon$ occurring in the definition of weak local minimum (§2).

Let us set

$A_K(t) = \{\zeta : \zeta \in \partial f(s, v),\ |(s, v) - (z(t), \dot{z}(t))| < 1/K\}$,

and define, for $t$ such that $f(t) \ge -1/K$,

$G_K(t) = A_K(t)^* = \{\gamma : \gamma \cdot \zeta \le 0 \text{ for all } \zeta \text{ in } A_K(t)\}$.

For $t$ such that $f(t) < -1/K$, set $G_K(t) = R^{2n}$.

Now let $K$ be larger than $1/\delta_1$. The following result then follows
from Lemmas 1 and 2:

Lemma 3. There is a constant $N > 1$ such that the convex cone $G_K(t)$
has the following property for each $t$: given any $s$ in $R^n$, there
exists $v$ in $R^n$ such that $|v| \le N|s|$ and $(s, v) \in G_K(t)$.

We now define a multifunction from $[0,1] \times R^n$ to $R^n$ as follows:

$E_K(t, s) = \{v : |v| \le \epsilon_K/2,\ (s, v) \in G_K(t)\}$.

In the terminology of [5], it follows that for $|s| < \epsilon_K/(2N)$, the
multifunction $E_K(t, s)$ is nonempty, compact-valued, integrably bounded,
measurable in $t$ and Lipschitz in $s$ with Lipschitz constant $N$.

Lemma 4. The arc $x(t) \equiv 0$ minimizes

$\varphi(z(1) + x(1))$

over all arcs $x$ satisfying $|x(t)| < \epsilon_K/(2N)$ and the constraints

$x(0) \in C_0 - z(0), \quad x(1) \in C_1 - z(1)$,

(4.1)  $\dot{x}(t) \in E_K(t, x(t))$ a.e.

Proof: Let any such $x$ be given. Notice that it suffices to prove the
inequality

(4.2)  $f(z + x, \dot{z} + \dot{x}) \le 0$ a.e.,

since then the fact that $z$ is optimal for our original problem over a
class of arcs including $z + x$ yields

$\varphi(z(1)) \le \varphi(z(1) + x(1))$.

In proving (4.2), consider first any $t$ such that $f(t) < -1/K$. Then (4.2)
follows from the choice of $\epsilon_K$, since we have

$|(x(t), \dot{x}(t))| < \epsilon_K$.

Now let us consider any $t$ such that $f(t) \ge -1/K$. We have

(4.3)  $f(z(t) + x(t), \dot{z}(t) + \dot{x}(t)) = f(t) + \int_0^1 Dg(\lambda)\,d\lambda$,

where the Lipschitz function $g$ is defined by

$g(\lambda) = f(z(t) + \lambda x(t), \dot{z}(t) + \lambda \dot{x}(t))$,

and $Dg(\lambda)$ exists a.e. It now suffices to prove that $Dg(\lambda)$ is
nonpositive for $\lambda$ in $[0,1]$, since then (4.3) implies

$f(z(t) + x(t), \dot{z}(t) + \dot{x}(t)) \le f(t) \le 0$.

In turn, in order to prove the nonpositivity of $Dg(\lambda)$, it suffices to
prove that $Dg(\lambda)$ belongs to the set (interval)

$S = \partial f(z(t) + \lambda x(t), \dot{z}(t) + \lambda \dot{x}(t)) \cdot (x(t), \dot{x}(t))$,

in view of the definition of $A_K(t)$ and the fact that $(x(t), \dot{x}(t))$ belongs
to $G_K(t)$. We proceed now to prove this.


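According to [3, Proposition 1.4] we have

$\max\{\sigma : \sigma \in S\} = \limsup\, [f(z + \lambda x + h + \delta x, \dot{z} + \lambda \dot{x} + h' + \delta \dot{x}) - f(z + \lambda x + h, \dot{z} + \lambda \dot{x} + h')]/\delta$,

where the lim sup is taken as $h$ and $h'$ converge to 0 in $R^n$
and $\delta$ decreases to 0. By definition, $Dg(\lambda)$ is equal to

$\lim\, [f(z + \lambda x + \delta x, \dot{z} + \lambda \dot{x} + \delta \dot{x}) - f(z + \lambda x, \dot{z} + \lambda \dot{x})]/\delta$

(limit as $\delta$ decreases to zero), whence

$Dg(\lambda) \le \max\{\sigma : \sigma \in S\}$.

A similar argument with $\min\{\sigma : \sigma \in S\}$ shows that $Dg(\lambda)$ belongs to
the interval $S$. Q.E.D.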

We now apply [5, Theorem 2] to the problem in the statement of
Lemma 4. If the function $H : [0,1] \times R^n \times R^n \to R$ is defined as follows:

$H(t, s, p) = \max\{p \cdot v : v \in E_K(t, s)\}$,

we deduce that an arc $p_K$ and a scalar $\lambda_K$ equal to 0 or 1 exist
such that:

(4.4)  $(-\dot{p}_K, 0) \in \partial H(t, 0, p_K)$ a.e.,

(4.5)  $p_K(0)$ is normal to $C_0 - z(0)$ at 0,

(4.6)  for some vector $\zeta_K$ in $\partial\varphi(z(1))$,
       $-p_K(1) - \lambda_K \zeta_K$ is normal to $C_1 - z(1)$ at 0,

(4.7)  $|p_K(t)| + \lambda_K$ is never zero.

Lemma 5. For almost all $t$,

(4.8)  $\dot{p}_K \cdot s + p_K \cdot v \le 0$ for all $(s, v) \in G_K(t)$.

Proof: It suffices to show this for $|v|$ small, since $G_K(t)$ is a cone.
Let $t$ be such that (4.4) holds. Then we may suppose that $v$ belongs
to $E_K(t, s)$, and consequently

(4.9)  $H(t, s, p_K) \ge p_K \cdot v$.

It is elementary to verify that the function $H(t, x, p)$ is concave in $x$;
along with (4.4), this implies that $-\dot{p}_K$ belongs to the superdifferential
at 0 of the concave function $x \to H(t, x, p_K)$. From this we deduce:

(4.10)  $H(t, s, p_K) - H(t, 0, p_K) \le -\dot{p}_K \cdot s$.

Since 0 belongs to $E_K(t, 0)$, it follows from the definition of
$H$ that we have

(4.11)  $H(t, 0, p_K) \ge 0$.

Now we combine (4.9) - (4.11) to obtain (4.8). Q.E.D.

Remark. From Lemma 5 and the definition of $G_K(t)$ we deduce:

(4.12)  $\dot{p}_K(t)$ and $p_K(t)$ are zero when $f(t) < -1/K$.

We shall now be considering all the above as the integer $K$
increases to infinity. By taking subsequences, we may assume that the
$\lambda_K$ are either all 0 or all equal to 1, and that the $\zeta_K$ converge
to a vector $\zeta$. From the easily proven fact that the function $x \to H(t, x, p)$
is Lipschitz with constant $N|p|$, along with (4.4), we deduce:

(4.13)  $|\dot{p}_K| \le N |p_K|$ a.e.,

where the constant $N$ is independent of $K$ (since $G_K(t)$ increases
with $K$, $N$ can only decrease as $K$ increases).

Lemma 6. There exist an arc $p$ and a scalar $\lambda_0$ equal to 0 or 1
satisfying (2.8) - (2.10) as well as:

(4.14)  $\dot{p} \cdot s + p \cdot v \le 0$ for all $(s, v) \in \partial f(t)^*$, a.e.,

(4.15)  $\dot{p}$ and $p$ equal 0 when $f(t) < 0$.

Proof: Case 1: The $\lambda_K$ are all 0. By scaling, we may assume that
all the $p_K$ are nonvanishing and $\|p_K\| = 1$ ($\|\cdot\|$ denotes the
supremum norm on $[0,1]$), where the rescaled functions continue to
satisfy (4.12), (4.13), (4.5) and (4.6) (with $\lambda_K = 0$). In view of (4.13),
the Dunford-Pettis criterion implies that $\{\dot{p}_K\}$ admits a subsequence
converging weakly in $L^1$ to $\dot{p}$ (say). It follows for suitable
subsequences that $\dot{p}$ is the derivative of an arc $p$ to which $p_K$ converges
uniformly (see [4, Lemma 5] for the details of the argument). Since $p$
satisfies (4.13) and $\|p\| = 1$, (2.10) holds (with $\lambda_0 = 0$), as well
as (2.8) - (2.9). Relation (4.15) is an immediate consequence of (4.12).

In order to prove (4.14), note first that $G_K(t)$ increases to $\partial f(t)^*$
for any $t$ such that $f(t) = 0$ (this uses the upper semicontinuity of
$\partial f$ [3]). Furthermore, weak convergence preserves linear inequalities
such as (4.8); the result follows.

Case 2: The $\lambda_K$ are all equal to 1, and $\|p_K\|$ is bounded. In
this case the argument is unchanged, except that the need to rescale
initially is eliminated. The conclusions (2.9) - (2.10) hold with $\lambda_0 = 1$.

Case 3: The $\lambda_K$ are all equal to 1, and $\|p_K\|$ is unbounded.
We may assume that $\|p_K\|$ increases to infinity. We rescale the
arcs $p_K$ by dividing by $\|p_K\|$ (which is certainly nonzero for $K$
large). The argument then continues as in Case 1, and we get conditions
(2.9) and (2.10) with $\lambda_0 = 0$, since $\lambda_K/\|p_K\|$ converges to 0. Q.E.D.

In order to complete the proof of the theorem, it now suffices to
infer (2.6) and (2.7) from (4.14) and (4.15). The condition (4.14) says
that $(\dot{p}, p)$ belongs to $(\partial f(t)^*)^*$, which is the closed convex cone
generated by $\partial f(t)$. This has the following characterization, for any $t$
such that $f(t) = 0$:

$(\partial f(t)^*)^* = \{\lambda \zeta : \lambda \ge 0,\ \zeta \in \partial f(t)\}$,

because $\partial f(t)$ is a compact convex set not containing zero. Invoking
a measurable selection theorem (see for example [9]), we obtain (2.6)
when $f(t) = 0$, and (2.7) follows by simply setting $\lambda(t) = 0$ when
$f(t) < 0$ and using (4.15). Q.E.D.

Remark. The case in which $f$ has an explicit dependence on $t$ may
be treated exactly as above with the additional hypotheses:

(a)  $f(t, x, v)$ is a measurable function of $t$ for each $(x, v)$,

(b)  $\partial f(t, x, v)$ is an upper semicontinuous multifunction (here,
     $\partial f$ refers to the generalized gradient with respect to $(x, v)$).

Both these hypotheses are automatically satisfied when $f$ is
independent of $t$. In the case of $t$-dependence, (a) is required to
ensure that the multifunction constructed in the proof is measurable
in $t$, while (b) is necessary for the conclusions of Lemmas 2 and 6.